GoSafeOpt: Scalable safe exploration for global optimization of dynamical systems

Abstract

Learning optimal control policies directly on physical systems is challenging. Even a single failure can lead to costly hardware damage. Most existing model-free learning methods that guarantee safety, i.e., no failures, during exploration are limited to local optima. This work proposes GoSafeOpt as the first provably safe and optimal algorithm that can safely discover globally optimal policies for systems with high-dimensional state space. We demonstrate the superiority of GoSafeOpt over competing model-free safe learning methods in simulation and hardware experiments on a robot arm.

Related resources

Safe Exploration for Identifying Linear Systems via Robust Optimization

Safely exploring an unknown dynamical system is critical to the deployment of reinforcement learning (RL) in physical systems where failures may have catastrophic consequences. In scenarios where one knows little about the dynamics, diverse transition data covering relevant regions of state-action space is needed to apply either model-based or model-free RL. Motivated by the cooling of Google’s...

PROJECTED DYNAMICAL SYSTEMS AND OPTIMIZATION PROBLEMS

We establish a relationship between general constrained pseudoconvex optimization problems and globally projected dynamical systems. A corresponding novel neural network model, which is globally convergent and stable in the sense of Lyapunov, is proposed. Both theoretical and numerical approaches are considered. Numerical simulations for three constrained nonlinear optimization problems a...

Global Optimization using a Dynamical Systems Approach

We develop new algorithms for global optimization by combining well known branch and bound methods with multilevel subdivision techniques for the computation of invariant sets of dynamical systems. The basic idea is to view iteration schemes for local optimization problems — e.g. Newton’s method or conjugate gradient methods — as dynamical systems and to compute set coverings of their fixed poi...

Safe Exploration for Optimization with Gaussian Processes

We consider sequential decision problems under uncertainty, where we seek to optimize an unknown function from noisy samples. This requires balancing exploration (learning about the objective) and exploitation (localizing the maximum), a problem well-studied in the multiarmed bandit literature. In many applications, however, we require that the sampled function values exceed some prespecified “...

observational dynamical systems

Abstract: In this thesis, we first study fuzzy metric spaces from an observational viewpoint. Fuzzy metric spaces and the topology generated by this metric are introduced. Then, based on the spaces introduced in the first chapter, topological chaos, minimality, and intersecting sets are studied in several settings. In the third chapter, the concept of fuzzy attractor sets is defined as a basic concept in relative semi-dynamical systems. ...

Journal

Journal title: Artificial Intelligence

Year: 2023

ISSN: 2633-1403

DOI: https://doi.org/10.1016/j.artint.2023.103922